2.2 Sufficiency

1 Definition

Sufficiency is a central concept because it lets us focus on the essential aspects of a dataset while ignoring irrelevant details.

Statistic, Sufficiency

  • A statistic $T(X)$ is any function of the data $X$ alone (it may not depend on the parameter $\theta$).
  • A statistic $T(X)$ is sufficient (for the model $\mathcal{P}$) if the conditional distribution of $X \mid T(X)$ is the same for all $P \in \mathcal{P}$, i.e. it does not depend on $\theta$.

In short, a sufficient statistic carries all the information about $\theta$ that the data contain.
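As a small illustration of the definition, the sketch below enumerates the exact conditional law of a sample given its sum. The Bernoulli model, the sample size, and the function name `conditional_given_sum` are assumptions chosen for this example and are not part of the notes.

```python
from itertools import product
from fractions import Fraction

def conditional_given_sum(theta, n, t):
    """Exact P_theta(X = x | sum(X) = t) for X_1..X_n i.i.d. Bernoulli(theta)."""
    # Joint probability of every binary sequence whose sum equals t.
    joint = {
        x: theta ** sum(x) * (1 - theta) ** (n - sum(x))
        for x in product((0, 1), repeat=n)
        if sum(x) == t
    }
    total = sum(joint.values())  # this is P_theta(T = t)
    return {x: p / total for x, p in joint.items()}

# Two different parameter values (hypothetical choices), same conditioning event.
cond_a = conditional_given_sum(Fraction(1, 5), n=4, t=2)
cond_b = conditional_given_sum(Fraction(7, 10), n=4, t=2)

print(cond_a == cond_b)        # True: the conditional law does not depend on theta
print(cond_a[(1, 1, 0, 0)])    # 1/6: uniform over the 6 sequences with sum 2
```

Exact fractions are used so that the comparison is an equality rather than a floating-point approximation.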

2 Factorization Theorem

Theorem (Factorization Theorem)

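Let $\mathcal{P}=\{P_\theta \mid \theta\in\Theta\}$ be a model with densities $p_\theta(x)$ with respect to a common measure $\mu$. Then $T(X)$ is sufficient iff there exist functions $g_\theta(t)\ge 0$ and $h(x)\ge 0$ such that $p_\theta(x)=g_\theta(T(x))\,h(x)$ for almost every $x$ under $\mu$.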
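As a worked instance of the theorem, take $X_1,\dots,X_n$ i.i.d. $N(\theta,1)$ (the choice of model is an assumption made for illustration). Expanding the square in the joint density separates the part that depends on $\theta$ from the part that does not:

$$
\begin{aligned}
p_\theta(x) &= \prod_{i=1}^n \frac{1}{\sqrt{2\pi}}\exp\!\Big(-\tfrac12 (x_i-\theta)^2\Big) \\
&= \underbrace{\exp\!\Big(\theta \sum_{i=1}^n x_i - \tfrac{n\theta^2}{2}\Big)}_{g_\theta(T(x))}\;
\underbrace{(2\pi)^{-n/2}\exp\!\Big(-\tfrac12 \sum_{i=1}^n x_i^2\Big)}_{h(x)},
\qquad T(x)=\sum_{i=1}^n x_i .
\end{aligned}
$$

So $T(X)=\sum_{i=1}^n X_i$ is sufficient in this model.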

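Another example is the order statistics. For $X_1,\dots,X_n \overset{\text{i.i.d.}}{\sim} P_\theta$ and any model $\mathcal{P}=\{P_\theta^n \mid \theta\in\Theta\}$ on $\mathcal{X}\subseteq\mathbb{R}$, if $P_\theta^n$ is invariant to permutations of $X=(X_1,\dots,X_n)$ (see exchangeability), then $S(X)=(X_{(1)},\dots,X_{(n)})$ is sufficient.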

3 Minimal Sufficiency

For the example of $N(\theta,1)$, we showed that $\sum_{i=1}^n X_i$ is sufficient. Then $\frac{1}{n}\sum_{i=1}^n X_i$ is also sufficient.
Some sufficient statistics compress the data more than others. For instance, $\bar{X}$ can be recovered from the order statistics $S(X)$, but not the other way around.

Proposition

Suppose $T(X)$ is sufficient and $T(X)=f(S(X))$ for some function $f$. Then $S(X)$ is sufficient.
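One way to see this, assuming the model has densities as in the Factorization Theorem (an assumption; the proposition is stated without it), is to substitute $T = f \circ S$ into the factorization. The name $\tilde g_\theta$ is introduced here only for this sketch:

$$
p_\theta(x) = g_\theta(T(x))\,h(x) = g_\theta\big(f(S(x))\big)\,h(x) = \tilde g_\theta(S(x))\,h(x),
\qquad \tilde g_\theta := g_\theta \circ f .
$$

The right-hand side is a factorization through $S$, so $S(X)$ is sufficient.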

Minimal Sufficient

T(X) is minimal sufficient if

  1. T(X) is sufficient.
  2. For any other sufficient statistic $S(X)$, there is a function $f$ with $T(X)=f(S(X))$ (almost surely under every $P\in\mathcal{P}$).

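We say $x, y \in \mathcal{X}$ are equivalent (denoted $x \sim_p y$) if the ratio $p_\theta(x)/p_\theta(y)$ does not depend on $\theta$.

If $T$ is sufficient, the Factorization Theorem gives $T(x)=T(y) \Rightarrow x \sim_p y$.

Theorem

$T(X)$ is minimal sufficient if $x \sim_p y \iff T(x)=T(y)$.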
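A sketch of the argument, ignoring measure-zero sets and points where a density vanishes (both simplifying assumptions). Here $x \sim_p y$ means that $p_\theta(x)/p_\theta(y)$ does not depend on $\theta$; the representatives $x_t$ and the functions $\tilde g_\theta, \tilde h$ are notation introduced only for this sketch.

Sufficiency. For each $t$ in the range of $T$, fix a point $x_t$ with $T(x_t)=t$. Since $T(x)=T(x_{T(x)})$, the assumed equivalence gives $x \sim_p x_{T(x)}$, so

$$
p_\theta(x) = \underbrace{p_\theta\big(x_{T(x)}\big)}_{g_\theta(T(x))}\;
\underbrace{\frac{p_\theta(x)}{p_\theta\big(x_{T(x)}\big)}}_{h(x),\ \text{free of } \theta},
$$

which is a factorization through $T$.

Minimality. Let $S$ be any sufficient statistic, with $p_\theta(x)=\tilde g_\theta(S(x))\,\tilde h(x)$. If $S(x)=S(y)$, then

$$
\frac{p_\theta(x)}{p_\theta(y)} = \frac{\tilde h(x)}{\tilde h(y)}
$$

does not depend on $\theta$, so $x \sim_p y$ and hence $T(x)=T(y)$. Thus $T$ is constant on the level sets of $S$, i.e. $T=f(S)$ for some function $f$.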

Q.E.D.
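The likelihood-ratio criterion can be checked numerically. The sketch below assumes the i.i.d. $N(\theta,1)$ model with $T(x)=\sum_i x_i$; the helper names, the grid of $\theta$ values, and the sample points are hypothetical choices made for illustration.

```python
import math

def log_density(x, theta):
    """Joint log-density of i.i.d. N(theta, 1) observations x."""
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (xi - theta) ** 2 for xi in x)

def ratio_is_constant(x, y, thetas, tol=1e-9):
    """Check whether log p_theta(x) - log p_theta(y) is the same for every theta in the grid."""
    log_ratios = [log_density(x, t) - log_density(y, t) for t in thetas]
    return max(log_ratios) - min(log_ratios) < tol

thetas = [-2.0, -0.5, 0.0, 1.0, 3.0]
x = [0.3, 1.2, -0.5]   # sum = 1.0
y = [1.0, 1.0, -1.0]   # sum = 1.0, same value of T as x
z = [0.3, 1.2, 0.5]    # sum = 2.0, different value of T

print(ratio_is_constant(x, y, thetas))  # True: equal sums, ratio free of theta
print(ratio_is_constant(x, z, thetas))  # False: unequal sums, ratio varies with theta
```

A finite grid can only suggest, not prove, that the ratio is constant; the algebraic reason is that the log ratio equals $\theta(\sum_i x_i - \sum_i y_i)$ plus a term free of $\theta$.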

3.1 Minimal Form

Minimal Form

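The form $p_\eta(x)=e^{\eta^\top T(x)-A(\eta)}h(x)$, $\eta\in\Xi$, is minimal if neither $\eta$ nor $T(X)$ satisfies a linear constraint, i.e. there is no nonzero vector $a\in\mathbb{R}^s$ and $b\in\mathbb{R}$ such that $\eta^\top a=b$ for all $\eta\in\Xi$, or $T(X)^\top a = b$ $\mathcal{P}$-almost surely.

Otherwise we can represent $\mathcal{P}$ as an $r$-dimensional exponential family form for some $r<s$.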

Proposition

If $p_\eta$ is in minimal form, then $T(X)$ is minimal sufficient.

The converse of this proposition is not true.
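A standard illustration, offered as a sketch and not necessarily the example the notes had in mind: the trinomial model with $n$ trials and cell probabilities $\pi=(\pi_1,\pi_2,\pi_3)$ in the open simplex, written with all three counts.

$$
p_\pi(x) = \frac{n!}{x_1!\,x_2!\,x_3!}\,\pi_1^{x_1}\pi_2^{x_2}\pi_3^{x_3}
= \exp\!\Big(\sum_{i=1}^3 x_i \log \pi_i\Big)\,h(x),
\qquad \eta_i = \log\pi_i,\quad T(x)=(x_1,x_2,x_3).
$$

Here $T(X)^\top (1,1,1) = n$ almost surely, so this $s=3$ form is not minimal. Substituting $x_3 = n - x_1 - x_2$ gives a 2-dimensional form with statistic $T'(x)=(x_1,x_2)$ and natural parameters $\log(\pi_1/\pi_3)$, $\log(\pi_2/\pi_3)$, which is minimal, so $T'(X)$ is minimal sufficient. Since $T$ and $T'$ are functions of each other, $T(X)$ is minimal sufficient as well, even though its form is not minimal.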

3.2 Diagram

For the case $s=2$, consider the following example:
[Figure: Pasted image 20241206202340.png]